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Abstract 

The nonlinear Helmholtz equation (NLH) models the propagation of electromagnetic 
waves in Ken - media, and describes a range of important phenomena in nonlinear optics and 
in other areas. In our previous work, we developed a fourth order method for its numerical 
solution that involved an iterative solver based on freezing the nonlinearity. The method 
enabled a direct simulation of nonlinear self-focusing in the nonparaxial regime, and a 
quantitative prediction of backscattering. However, our simulations showed that there is a 
threshold value for the magnitude of the nonlinearity, above which the iterations diverge. 

In this study, we numerically solve the one-dimensional NLH using a Newton-type non- 
linear solver. Because the Kerr nonlinearity contains absolute values of the field, the NLH 
has to be recast as a system of two real equations in order to apply Newton's method. Our 
numerical simulations show that Newton's method converges rapidly and, in contradistinc- 
tion with the iterations based on freezing the nonlinearity, enables computations for very 
high levels of nonlinearity. 

In addition, we introduce a novel compact finite-volume fourth order discretization for 
the NLH with material discontinuities. Our computations corroborate the design fourth 
order convergence of the method. 

The one-dimensional results of the current paper create a foundation for the analysis of 
multi-dimensional problems in the future. 
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1 Introduction 



1.1 Background 



The nonlinear Helmholtz equation (NLH) 



A£(x) 



°n 2 E = 0, n 2 (x,|£|) 



2n (x)n 2 (x)|£| 



(1) 



governs the propagation of linearly-polarized, time-harmonic electromagnetic 
waves in Kerr-type dielectrics. Here, x = [xi, . . . , Xjj] are the spatial coordinates, 
E — E(x) denotes the scalar electric field, u is the laser frequency, c is the speed 
of light in vacuum, A = d%. 1 + . . . + d 2 D is the .D-dimensional Laplacian, n is the 
linear index of refraction, and n 2 is the Kerr coefficient. In this study, we consider 
the case of an inhomogeneous medium in which both n and n 2 can vary in space. 
We assume that the medium is lossless, i.e., that n and n 2 are real. Furthermore, 
we consider only the case in which the electric field E and the material coefficients 
n and n 2 vary in one spatial direction that we identify with the direction of prop- 
agation and denote by z. Hence, equation (OQ) reduces to the one-dimensional cubic 
NLH: 



dPE(z) 



5 (*) 



2n (z)n 2 (z) \E\ Z ) E = 



(2) 



Grated Kerr medium 



Incoming field 
rpO p ik z 



Reflected field 

Re ik Z 



Transmitted 
field 

gik Z 



dz 2 C 
The ordinary differential equation © 
arises, for example, when model- 
ing nonlinear optical devices, such 
as the Fabry-Perot etalon [1], see 
Figure [Q This device consists of a 
layer or slab of Kerr medium lo- 
cated between < z < Z max . The 
Kerr slab is surrounded by a lin- 
ear homogeneous medium, so that 
n = n e Q l and n 2 = for z < and 
for z > Z max . We consider the case 

when an incoming plane wave E = Ef RC e tk ° z impinges normally on the slab at the 



-+- 



Figure 1. A grated Fabry-Perot etalon. 



interface z — 0. Here, k 



^ n ext 



is the linear wavenumber in the surrounding 
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linear medium. Let us define 



v{z) = (n (z)/n^) 2 , e(z) = 2n 2 (z)n (z) / \nff . 
Then, equation © transforms into 

^^ + kl(u(z)+e(z)\Ef)E = 0, (3) 

where v = 1 and e = for z < and for z > Z max . 

We assume that the Kerr material is either homogeneous, i.e., 

v{z) = v m \ e(z) = ^\ 0<z<Z max , (4) 

or layered (piecewise-constant). The latter case corresponds to a one-dimensional 
grating (see Figured)), where for some given partition: 

= z x < ■ ■ ■ < Zi < ■ ■ ■ < z L = Z max , (5a) 

we have: 

v(z) = v h e(z) =e h for z e {z h z l+1 ) . (5b) 

At the interfaces zi, the boundary conditions for Maxwell's equations imply con- 
tinuity of the field E(z) and its first derivative (see Appendix lAl). Note that, 
the material coefficients u(z) and e(z) are, generally speaking, discontinuous at the 
Kerr medium boundaries z = and z = Z max . 

When equation © is considered on the interval < z < Z max , it needs to be 
supplemented by boundary conditions at z = and z = Z max . Outside of this 
interval, the field propagates linearly with v = 1 and e = 0. Therefore, for z < 0, 
the total field is composed of a given incoming wave and the unknown reflected 
wave 

E(z) = E? nc e ikoZ + Re~ ikoZ . (6a) 
For z > Z max , the electric field is given by the transmitted wave 

E(z) = Te ikoz . (6b) 

The transmitted and reflected waves shall be interpreted as outgoing with respect to 
the domain of interest [0, Z max \. Note that the left-traveling wave Re~ lk ° z contains 
the field reflected from the interface z = (i.e., the reflection per se), as well as the 
field generated by nonlinear backscattering inside the interval [0, Z max \. 

The transmitted field (l6b~l) satisfies a Sommerfeld-type homogeneous differential 
relation at z = Z max +: 



I - A e 



z=Z u 



j- z ~ ik o) T e ikoz 



z=Z n 
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Hence, continuity of E and at z = Z max yields the following boundary condi- 
tion: 

/ d 



\dz-*>) E 
Similarly, at z = 0— we can write, see ( l6al ): 



= 0. (7a) 

2— 2max 




z=0~ 



z=0- 



Hence, the continuity of E and ^ at z = leads to the boundary condition: 




2ik Q El c . (7b) 



2=0 



The boundary conditions (P7al) and ([TBI) enable the propagation of outgoing waves 
from inside the interval [0, Z max ] toward its exterior. In addition, the boundary con- 
dition (|7bl) prescribes the given incoming wave E^ nc e lk ° z at the left boundary z — 0, 
and is therefore referred to as the two-way boundary condition. 

The problem ©, © can be rescaled as follows: 

E = E/E° nc , e = e|£'° c | 2 . 

Hence, we can assume hereafter with no loss of generality that 

El = 1- (8) 

Under this rescaling, a variation in e represents a variation in the input beam 
power |£P C | 2 . 

Closed form solutions for equation © in a homogeneous medium © were first ob- 
tained by Wilhelm [2] for a real-valued field, and by Marburger and Felber [3] for 
a complex-valued field. These solutions were later used by Chen and Mills [4] to 
solve equation ([3]) with the boundary conditions ©, as follows. Since the NLH © 
is a second order ODE, the boundary condition (P7al) at z = Z max , together with a 
choice of the transmitted field amplitude T, constitute an initial value problem at 
z — Z max that has a unique solution E = E(z; T, e). For an arbitrary value of 
T, the solution E(z; T) does not, generally speaking, satisfy the boundary condi- 
tion (l7bl) at z = 0. One can therefore use a shooting approach to find the value(s) of 
T = T(e) for which the solutions of the initial value problem also satisfy (TTbl) [and 
hence the full problem ©, ©, ©]. When the nonlinearity e is small, the function 
T = T(e) is single- valued, see Figure (2tA). When the nonlinearity exceeds a cer- 
tain threshold e > e c , the function T = T(e) becomes multi-valued, which implies 
nonuniqueness of the solution. The nonuniqueness occurs at certain intervals of e 



Note that as \ E(Z niax )\ = \T\,a choice of T is equivalent to a choice of E at z = Z n 
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Figure 2. (A) The transmittance \T\ as a function of e for the solution of the one-dimen- 
sional NLH © with v = 1, ko = 8 and Z ms , x = 10. (B) Zoom-in on the first region of 
switchback-type nonuniqueness for 0.7234 « e c < e < e' c « 0.7249. 

and is of a switchback type, see Figure EtB). In the physics literature, this behavior 
is often referred to as bi stability. 



In a subsequent paper [5], Chen and Mills extended their approach to the case of 
piecewise-constant material coefficients ©, which corresponds to the formulation 
that we analyze numerically in this paper, see Section [L2l Knapp, Papanicolaou and 
White [6] considered the case of a large homogeneous slab and a weak nonlinearity. 
They showed that the threshold for nonuniqueness e c scales as Z~l x . They also 
treated random media, which we do not consider here. 

In addition to analytical studies, equation © was also studied numerically using 
a shooting approach [7-10] which is conceptually similar to the one of Chen and 
Mills [4,5]. Unlike [4,5], however, in these studies, for each value of T at Z max the 
Cauchy problem is solved numerically, rather than analytically. The advantage of 
this approach over [4,5] is that it can be applied to media with a smooth variation of 
material properties [7-10] and to lossy materials [7], as opposed to only piecewise- 
constant media in [4,5]. The main shortcoming of the shooting approach, however, 
is that it cannot be generalized to multidimensional problems. 

The NLH can also be solved numerically as a full boundary value problem. In our 
previous work [11-13], we solved the multidimensional NLH (OQ) for the homoge- 
neous Kerr medium with u = 1 and e = const 0] To do that, we developed and 
implemented nonlocal two-way boundary conditions similar to ©; they provided a 
key element of the numerical methodology. In [14, 15], Suryanto et al. used a finite 
element scheme for solving the one-dimensional NLH ([3]) subject to the two-way 
boundary conditions. The finite element approximation constructed in [14, 15] al- 
lowed for material discontinuities at the grid nodes. This approximation was of a 
mixed order; the linear terms of © were approximated with fourth order accuracy, 
while the nonlinearity was approximated with second order accuracy. 



3 Note that the v = 1 corresponds to the case for which the linear index of refraction n (x) 
is the same both inside and outside the Kerr medium. 
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Let us emphasize that at the points where the material coefficients v and/or e are dis- 
continuous, the second derivative of the solution E{z) is discontinuous. The pres- 
ence of discontinuities in the solution must be properly accounted for when build- 
ing a numerical approximation of equation ©. In particular, a naive high-order 
approximation may lose its accuracy as the grid is refined. In this context, we note 
that the coefficient e is always discontinuous at least at z = and z = Z max . Such a 
discontinuity cannot be addressed by a scheme that assumes smoothness across the 
boundary, such as the standard (five-point) fourth order central-difference scheme 
used in our previous work [11-13]. Indeed, we have observed in [13] a deterioration 
of the fourth order accuracy at fine grid resolutions. 

In the current paper, we present a novel fourth order numerical scheme for the 
NLH ([3]) based on a compact approximation of finite volume type. The use of inte- 
gration over the grid cells allows us to correctly account for the discontinuities in 
v[z) and e(z) both at the outer boundaries and inside the Kerr medium. The fourth 
order accuracy is attained on a compact three node stencil by using the differen- 
tial equation © to eliminate the leading terms of the truncation error. A similar 
equation-based approach was used by Singer and Turkel in [16] to obtain a com- 
pact high order approximation for the linear Helmholtz equation. As we shall see, 
however, construction of a compact approximation for finite volumes, and espe- 
cially in the nonlinear case is considerably more complex. In particular, we need 
to use Birkhoff-Hermite interpolation to approximate the field between the grid 
nodes with fourth order accuracy. To the best of our knowledge, this is the first time 
ever that a genuine fourth order scheme is built for the NLH with discontinuous 
coefficients. 

While we analyze the formal accuracy of our schemes, a theoretical error estimate 
is beyond the scope of this paper, because the problem is nonlinear. Instead, we 
evaluate the numerical error experimentally, and demonstrate that the schemes pos- 
sess the anticipated rate of convergence. Moreover, in Appendix [B] we provide a 
convergence proof for a linear problem with a material discontinuity, in which the 
material coefficient v is in the form of a step function. In this case, we can obtain 
closed form solutions for both the continuous equation and its discrete counterpart, 
and use them to establish the error estimates. Note that this simple setup captures 
the key features of our treatment of material discontinuities by finite volumes, and 
illustrates that the scheme indeed has the design rate of grid convergence. 

The second key improvement offered by the current paper is in the methodology 
used to solve the nonlinear equations on the grid. Previously [11-13], we solved 
the NLH by simple iterations based on freezing the nonlinearity; a similar approach 
was also employed by Suryanto et al. in [14, 15]. While this approach has allowed us 
to obtain a number of interesting solutions to the NLH with a weak nonlinearity, for 
somewhat stronger nonlinearities the iterations would cease to converge [11-15]. 
In order to overcome this limitation, in this paper we solve the NLH ([3]) using 
Newton's iterations. Applying Newton's method to the NLH is not straightforward 
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though, since the nonlinearity in © is nondifferentiable in the sense of Frechet. 
We recall that the solutions of the NLH © must be complex valued, otherwise it is 
impossible to adequately describe traveling waves in the time-harmonic context^ 
Hence, to obtain a proper Newton's linearization we recast the complex equation © 
as a system of two real equations. In the literature, Newton's method has been ap- 
plied to similar problems. For example, in the work of Gomez- Gardenes, et al. [17], 
the authors solve the steady-state nonlinear Schrodinger equation on a lattice by 
Newton's method (see also [18-21]). Our particular implementation of Newton's 
method for the NLH leads to a block tridiagonal structure of the Jacobians, which 
enables an efficient inversion. We also note that the application of Newton's method 
to a higher order discretization of the the NLH with material discontinuities brings 
along additional complications (Section©. 

Our computations show that the use of Newton's iterations leads to a very consider- 
able improvement in performance over the previous "frozen-nonlinearity" iterative 
methods [11—15], as it enables robust numerical solution of the NLH for strong 
nonlinearities. In fact, solutions can be computed for nonlinearities far above the 
threshold of nonuniqueness, and even for the nonlinearities that lead to material 
breakdown in an actual physical setting. Note that in the latter case, the Kerr model 
itself becomes inapplicable. 

The paper is organized as follows: In Section [L"2l we present a summary of the math- 
ematical formulation. In Section [2l we describe our discrete approximation. We 
begin with the finite volume formulation (Section 12711) . then introduce two second 
order approximations (Section 12.21 and Section 12.31 ) and the fourth order approxi- 
mation (Section |2~4l) , and finally construct the boundary conditions in the discrete 
setting (Section [231) . In Section© we build a Newton's solver for the Frechet non- 
differentiable NLH. To clarify the presentation, we first illustrate the approach for 
a single variable (Section 13.11) . then generalize to multivariable nondifferentiable 
functions (Section [3721) . apply the method to the three discrete approximations of 
the NLH (Section 13.31 ), and finally discuss the choice of the initial guess (Sec- 
tion 13.41) . A summary of the numerical method is given in Section |4j Numerical 
computations are performed in Section [51 examining the convergence of the iter- 
ations and the computational error of the methods (Section 15.21 and Section 15 .31 
respectively). We conclude with a discussion in Section [61 

1 . 2 Summary of the formulation 

In the current paper, we will be solving the one-dimensional NLH [cf . ©] : 

+ k 2 (y{z) + e(z) \E\ 2 ) E = 0, 0<z< Z max , (9a) 
4 This is reflected by the fact that the boundary conditions Q are complex. 
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subject to the boundary conditions [cf. (ITU) . (l7bl)1: 

= 0. (9b) 

In formulae (|9al) , (l9b~l) , we assume the scaling _E? C = 1, see ©. The medium on the 
interval [0, Z max ] can have piecewise-constant material coefficients: 

u(z) = P h e(z) = ei, for z G (zi,k+i) ■ (9c) 

For simplicity only, we assume a uniform partition into L — 1 homogeneous slabs 
of equal width Az = ^f: 

zi = (l-l)Az, l = l,...,L. (9d) 

The homogeneous case ([4]) corresponds to the case L = 2. At the interfaces 5;, the 
solution E(z) and its first derivative are continuous, but the second derivative 
is discontinuous. Away from the interfaces, i.e., inside every interval (|9cl) , the 
material coefficients 1/(2) and e(z) are constant, and the NLH ( |9al ) implies that the 
field -E(z) is infinitely differentiable. 



2 Discrete approximation 

In this section, we present our discretization of problem ©. First, we introduce an 
integral formulation of the NLH (|9a|) (see Section [OT) and discretize it on the grid 
(Section [2T2l I2.3L and 12 .41 ) . Then, we implement the boundary conditions (|9bl in a 
fully discrete framework (Section [231 ) . 

2.1 Integral formulation 

Let a, b G [0, Z max ], a < b, and let us integrate equation (l9al) between the points a 
and b with respect to z. Since 4^ is continuous everywhere, we obtain: 

dz J a K ' 

Equation (TTOl) can be interpreted as the integral conservation law that corresponds 
to the NLH (l9al) . It is easy to see that for sufficiently smooth solutions the two 
formulations are equivalent. Indeed, if we require that the integral relation (flOl) 
hold for any pair of points a and b, then at every point z where exists the 
NLH (l9ab can be reconstructed from the conservation law (flOl) by a straightforward 
passage to the limit: a — > z — 0, 6 — > + 0. However, the integral formulation (flOl) 



2ikn 



ik E 



2=0 
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makes sense even when the differential equation per se loses its validity because of 
insufficient regularity of the solution, i.e., when the material coefficients undergo 
jump discontinuities and the second derivative becomes discontinuous. 

Let us introduce a uniform grid of M nodes on the interval < z < Z max : 

Z 

z m = (m — where h = TF~[> m = l,...,M. (Ha) 



We choose h so that Az of (I9d1) is an integer multiple of h. This choice guarantees 
that material discontinuities will only be located at the grid nodes, i.e., that both 
u(z) and e(z) will be constant within each grid cell: 

v(z) = v m , e(z) = e m , z G (z m , z m+1 ) . (lib) 



To approximate the NLH on the grid (111 ah . we apply the integral relation (flOl) be- 
tween the midpoints of every two neighboring cells, i.e., for [a, b] = [z m _i, z m+ i], 
m — 1,2, ... , M. Then, using formula (11 lbL we arrive at 



dE 

dz 



2 2 f Zm 2 f Zm 1 2 

+ k v m ^i I Edz + k e m -i / \E\ E dz 

J Z 1 J Z 1 

I m—Tj m—Tr 

Z A 

+ k 2 v m p n+h Edz + k 2 e m j Zm+h \E\ 2 Edz = 0. 



(12) 



Equation (TT2l) relates integrals of the unknown continuous function E(z) with its 
derivatives at z = z m± i. We will approximate the individual terms in (fT2l) using 
the nodal values E(z m ) = E m , m = 1, . . . , M, of the field. The resulting scheme 
will be equivalent to compact finite differences on the regions of smoothness of the 
solution, where it could also be obtained without using the integral formulation (see 
Section loTTI for further discussion of an approach alternative to the use of integral 
formulation). Otherwise, i.e., near the discontinuities, the scheme will approximate 
the integral relation (flOl) . and hence (fT2l) . rather than the differential equation ( l9al) . 



Recall that the material coefficients u(z) and e(z) are constant in between the 
grid nodes and consequently, E(z) is infinitely differentiable within each grid cell. 
Hence, all the integrands in (fT2l) can be approximated with fourth order accuracy 
using cubic polynomials. Together with a fourth order approximation of the deriva- 
tives, this yields a fourth order compact scheme for the NLH (|9al ), see Section [2~4l 
An even simpler piecewise linear approximation of E{z) yields a second order 
compact scheme, and we will describe its two different versions, in Sections 12.21 
and !2.3l In addition to providing a reference point for comparison, the second order 
schemes allow us to introduce the general framework and notations exploited later 
for building the more complex fourth order method. 
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2.2 Second order approximation 



We approximate the first term on the left-hand side of (1121) using central differences: 

E m E m _i 



dE 

dz 



Z 1 
m- 2 



E m +i — E n 

h 



h 



Oh 



(13) 



Without assuming any additional regularity of E(z) beyond the continuity of its 
first derivative, we merely have the difference of two fluxes approximated with 
second order accuracy^] If, however, the material coefficients are continuous at z m , 
i.e., if v m = v m -i and e m = e m _ ls then ^-f and higher derivatives exist and are 
continuous as well. In this case, if we divide the undivided second difference on 
the right-hand side of (IT3T) by h, then a straightforward Taylor-based argument will 
yield a second order central-difference approximation of : 



E 



m+l 



2 E m + E, 



m— 1 



d 2 E 
dz 2 



+ h 



(14) 



To approximate the third integral on the left-hand side of (1121) . we linearly interpo- 
late E( z) on the interval [z m , z m _^_i\'. 



E{z) =E(z m + kC) = (l-C)E m + (E m+1 + O (h 2 ) , C e [o, - 
Then, substituting expression (TT5T) into the third integral of (fT2l) . we have: 

[ Zm+i Edz = h j 1 ' 2 [(1 - C)E m + (E m+1 ) dC + 0(h 3 ) 

J Z m JO 

= -rrE m + —E m+ i + 0(h 3 ). 
o o 



(15) 



(16) 



Likewise, we can linearly interpolate the cubic term \E\ 2 E on [z m , z m+ i] to obtain: 
[ +1 \E\ 2 E dz = — \E m \ 2 E m + — l^m+il 2 E m+ i + 0(h 3 ). 

Jz m O 

The expressions for the subinterval [z m _i,z m ] are derived similarly, we merely 
replace u m , e m , and E m+ i with z/ m _i, e m _i, and E m -i, respectively. Finally, by 
assembling all the terms we arrive at the following second order approximation of 



5 The flux difference on the right-hand side of (IT3T > is exactly the same as we would have 
obtained if we approximated the second derivative by the standard piecewise linear 
Galerkin finite elements, see, e.g., [22]; having a continuous first derivative of E{z) is 
sufficient for building this approximation. 
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the integral relation (fT2l) for m = 1,2, . . . , M: 



hF m (E) 



dcf-^m+l 



+ hk%u m _x 



E E 

E m -1 



E„ 



m—l 



h 

3E ri 



r r.2 3E m + E m j_i 



\E m —i\ E m —i + 3 | | E n 



(17a) 



'm- 1 



, i i 2 ^ I -Em I E m 4~ l-Bm+il E m +\ 
+ AiKnem ^ = 0- 



The vector E = [E±, . . . , Em] T was used as an argument of F m (E) in for- 
mula (I17al) . because F m operates on i? m _i, £ m , and E m+ i. Hence, for the interface 
nodes m — l and m = M, the system of equations (I17al) requires the addition of 
the ghost nodes m = and m = M + 1, respectively. The value of the field at the 
ghost nodes will be determined by the boundary conditions, see Section [231 Note 
also that the notation E m needs to be interpreted differently in different expressions. 
Namely, in (fT3l) . (TT5T) . (fT6l) and similar formulae that introduce approximation of 
the individual terms in (PT21) . Em denotes the value of the exact continuous solution 
of © on the grid (111 a.b ■ In formula (|17al ), however, E m denotes the approximate 
discrete solution, which we calculate numerically. 

If the material coefficients v and e are continuous at z m , i.e., if z/ m -i = and 
Cm-i = e m 5 then ^-f exists at this point along with higher order derivatives. In that 
case, equation (|17a|) reduces to 



E m +i — 2.E m + E m -i i , 2 E m _i + 6E m + E m+ i 



h 2 



+ /c e r 



|-E m _i| £" m _i + 6 |E m | E m + l-Bm+lj -Em+i 



(17b) 



0. 



In scheme (|17b| ), the second derivative ^-f is approximated by the conventional 
second order central differences (PT4l) . but the non-differentiated terms are evaluated 
as weighted sums over three neighboring nodes rather than pointwise. 



2.3 Alternative second order approximation 



Instead of interpolating the cubic term \E\ 2 E as in Section [2T2l one can substitute 
the linear interpolation (fT51) into the corresponding integrals of (PT2l) . This approach 
is slightly more cumbersome. As we will see in Section l2~4l however, it will enable 
the construction of the fourth order compact discretization. 



11 



It is convenient to adopt a tensor notation. First, we recast formula (1151) as 

i 

E{z rn + hQ=Y / Fi(0 E m+i + O(h 2 ), where F = I - (, F 1 = 



i=0 



This representation, when substituted into the linear integral term of (PT21) . provides 
an equivalent alternative form of equation (PT6l) : 

f m+h Edz = hj2( P F t d() E m+l + O (h 3 ) =hj2 fiEm+i + O (it 

S v ' 

fi 

while its substitution into the cubic term \E\ 2 E = E*E 2 yields: 



Z ™+h I 77112 



E\ Edz 



h f' 2 (E^CO^Ch) (j:F J (0E m+ A (j2F k (C)E m+k ) dC, + 0(fv 

J0 \i=0 ) \j=0 J \k=0 / 



i,j,k=0 

9ijk 

1 

^ E 9ijkEm+iEm+jEm+k + O (h 
i,j,k=0 



The constants /, and in the previous formulae are defined as 

i i 
fi = Jj Fi d(, g ijk = Jj FiFjF k d(, i, j, k = 0, 1. 

Evaluation of these integrals yields: 

f = 3 f =i 

JO g ? J 1 g ! 

_ 15 11 5 1 

#000 - #001 - #011 - ^> 0111 - g4" 

Note that the tensor elements g^k are symmetric with respect to any permutation of 
the indices i, j, and k, e.g., g 011 = g wl = g no . 

Altogether, the integrals over [z m , z m+ i] in (|T2]) are approximated as 

f m+h (u m E + e m \E\ 2 E) dz = 

l l 

hv m ^2 fiEm+i + he m ^2 gijkE m+i E m+ jE m+ k + 0(h 3 ), 

i=0 i,j,k=0 
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and the integrals over [z m _i, z m ] are approximated the same way. Hence, the al- 
ternative second order discretization of the integral relation (PT21) can be written as 



, T-, /ttin def -E'm+l E m E m E m —\ 

nr m {tj) = 

h h 

1 1 

Vm—l ^ ] fiE m —i H~ h^o^m—l ^ ] 9ijkE m _^E m _jE m _f c 
i=0 i,j,k=0 
1 1 

+ hk n u m fjE m+ j + hk e m ^ gij^E^ n+i E m j r jE m j r k = 0, 

i=0 i,j,k=0 

(18a) 

where m = 1, . . . , M. Similarly to (I17al) . _E m in formula (I18al) should be interpreted 
as the approximate solution on the grid (111 ab . and its values at the ghost nodes 
m = and m = M + 1 are determined in Section 12.51 Again, if v and e are 
continuous and E is smooth at z m , then scheme (I18al) reduces to a central-difference 
second order scheme for the NLH (l9al): 



-Pm(E) — 7^ h klu m ^2 fi(E m -i + E m+i ) 

1 i=0 (18b) 

+ &o e «i fl'ijfe [Em-iEm—j E m -k + E^+iEm+j E m+ kj = 0. 

i,j,k=0 

Note that the linear terms in (I18al) and (I18bl) are identical to those in (I17al) 
and (|17bk respectively, they are merely expressed in a different form. 



2.4 Equation-based fourth order approximation 



In this section, we build a compact fourth order discretization for the integral rela- 
tion (fT2l) . The general idea of all compact schemes is to use the original differential 
equation to obtain the higher order derivatives that could help cancel the leading 
terms of the truncation error and thus improve the order of accuracy. This idea has 
been implemented, e.g., by Singer and Turkel in [16] for a finite-difference ap- 
proximation of the linear Helmholtz equation. Hereafter, we adopt some elements 
of their equation-based approach. As we shall see though, some additional com- 
plications arise when this approach is applied to the approximation of the integral 
relation (fT2l) . which, in particular, involves nonlinearity. 

The differential equation (|9a|) inside the grid cells can be used to evaluate the one- 
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sided second derivatives at the grid nodes as follows: 

d 2 E 



TP" — f 

m+ ~ dz 2 



TP" 



dcf 



d 2 E 



(m+1) " " dz 2 



(19a) 
(19b) 



z=z m +i- 



Subsequently, formulae <fT9l will be used to approximate each of the five terms on 
the left-hand side of (PT21) with fourth order accuracy. 

To approximate the fluxes E' i in (f!2l) . we first use the Taylor expansion: 



E m +i — E m h 2 ^(3) 



h 



—E w Ml +0(h 4 ). 
24 m+ 5 



Then, we approximate the third derivative E m+1 _ with second order accuracy and 
use (fl~9~l ), which yields: 



m +2 V 

= h 

Finally, we introduce the dimensionless grid size 

h = k h 

and obtain: 



Oh 2 ). 



E'.i 



m+i 



l(h 2 ( 2 v\ 

^ I 1 + — \ Vm + Cm l-Bm+ll J I E m+ i 

1 / fl 2 \ 

- ^{l + — (u m + e m \E m \ 2 )\E m + 0(h 4 ). 



We repeat the calculation for E' _i. Altogether, the flux difference, i.e., the first 
term in (fT2~l) . is approximated as 



2 x 1 



2 — -En 



/) 2 \ P P 
/t \ J-J-ni - C/ m— 1 



2 1 



+ e r 



^m— 1 



h 2 \E m+ i\ E m+ i — \E m \ E m 

1,2 Ip |2 n I rp |2 r-i 

'* - C/ m - C/ m - C/ m— 1 - C/ m— 1 



1 + 24 



24 



h 4 . 
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Next, we approximate the four integral terms in (fl2l) . To do that, we build fourth 
order polynomial approximations of the integrands. The following lemma is instru- 
mental for this purpose. 

Lemma 1 Let E £ C 4 ([z m , z m+ i\). Let the values E m = E(z m ) and E m+ i = 
E( 

z m.+i) be known along with the values of the one-sided second derivatives E'^ 
and E'/ m+1 \. Then, the function E(z) is approximated with fourth order accuracy: 

E{z m + Ch) = P 3 (C) + O (V ) , z £ [zm, z m +i} , (20a) 
by the Hermite-Birkhojf cubic polynomial: 



Ps(0 = (E m - ^El\ (1 - C) + (1 - C) 3 

/ h 2 „ \ h 2 „ 3 

+ ( E m+1 - —E" m+1] _ J C + y J E(' m+ i ) _C 3 - 



(20b) 



Moreover, given E m , E m+ i, E'^ + , and E'/ m+1 ^_, the polynomial A20b\) is unique. 



PROOF. See Appendix O 



Note that, in general, for the construction of P 3 on a given individual interval 
[z m , z m+ i], it is unimportant that the derivatives in formula (I20bl) are one-sided. 
We only use one-sided derivatives in order to be able to use the result in the con- 
text of discrete approximation on the entire grid, when the material coefficients 
and hence second derivatives of the solution can undergo jumps at the grid nodes. 
We also note that the cubic polynomials built in accordance with Lemma [Dare not 
equivalent to the standard cubic splines, see Section [67T1 for more detail. 

Substituting expressions (fT9l) into formula (I20bl) . we obtain a fourth order approxi- 
mation of E(z) on [z m , z m+ i\: 

E{z m + (h)= (l + y (v m + e m \E m f)^ E m (1 - C) 
h 2 

— {ym + e m l-EVnJ 2 ) E m (1 — () 3 

+ + — [y m + e m l-^m+il 2 )^ E m+ iC 

6 

For convenience, let us rewrite the previous expression as 

3 

E{z m + (h) = J2 P(C; h, vm)vt + O (h 4 ) , (21) 

i=0 



I -Em+1 1 ) E m+ i C 3 + 0(h 
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where 



F (C; h, u) = (l- C) f l + i/y (l - (l - C) 2 )) , F 2 (C; h, v) = C (i + ^ (l - C 2 ) 

ri 

6 



F X (C; = ^ (1 - C) (i - (i - C) 2 ) , ^(C; h, v) = (i - C 2 ) 



and 

^o" = E m , = e m |.B m | E m , — E m +i, v% = e m |_Z? m _|_i| E m +i. 



Then, substituting expression (l2Tj) for -E(z) into the last two integral terms of (fT2~)) 
and evaluating the integrals with respect to £, we have: 

f Edz = h"t(r F((; v m , h) d() vf + O (h 5 ) = hj^fivf + 0(h 5 ) 

Jzm i=0 V° I i=0 
v v ' 

fi 

f m+h \E\ 2 Edz =h If' F i F j F k d()(v?)*v+v+ + O (h 5 ) 

Jzm i,j,k=0 V / 

= * E 9vk-(vtTv+vt + 0(h s ). 

i,j,k=0 

The constants /* and g^ in the previous formulae are defined as 



fi(u, h) = / 2 F,(C; i/, fc) dC, 9ijk{v, h)= I FiFjFk d(, 
Jo Jo 

h 3i k = 0, • • • , 3, 



(22) 



and their values are given in Table [Tf^l As in the case of the second order scheme 
(Section 1231) . it is clear from the definition of the tensor elements g^, formula (1221) . 
that they are symmetric with respect to any permutation of the indices {i, j, k}. 

Evaluation of the integrals of (fT2~l) for the interval [z m _i, z m ] is nearly identical; it 
only requires replacing (u m , e m ) with (v m ~i, e m -i) and vf with vj , where 

^0 = F-mj = Cm— 1 l-E'ml F m , f 2 = E m — \, = 6 m — i E m _\. 



Finally, by combining the approximations for all the individual terms in (fT2~)) we 



6 A direct computation of all the tensor elements in (1221 ) could be quite tedious and prone 
to errors. This task, however, can be efficiently automated, see Appendix iDl 
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coefficients 


explicit expression 


/oO) 




hiy) 


Kir 


h{y) 


V V 4 / 7 


J3 \ u ) 


JL(hV 

24 \ 4J 


9ooo {v) 


15 , 9 „ f/^ 2 , 21 .^c^N 4 , 3 ..S^N 6 


9ooi W 


16 + T6 U+ 10 17 


9on (y) 


32 +10 " 


9m M 


10 V 4 J 


9002 iy) 


11 , 41 ^n 2 , 1949 ,,2(h\ 4 , 2791 ^//n 6 
192 144 ^ V 4 / 4320 U / 11340 V 4 ^ 


9012 (^) 


53 (h\ 2 , 845 (ft^ 4 ,, 1 2791 (h\^ 7 .2 
720 V 4 / " r 3024 U / ^ 11340 U / ^ 


9003 I 17 ) 


llfft\ 2 , 577 ffc\ 4 , y 1 2791 M^,.2 
80 U/ 1680 ^ "•" 11340 V4^ u 


5013 iy) 


577 (M 4 , 2791 (h\ 6 
3360 U/ "r 11340 \ 4' u 


5112 (^) 


3257 f/i^ 4 | 2791 /ii\ 6 
30240 W 1 11340 \i> 


ffii3 iy) 


2791 //i\ 6 
11340 V4/ 


9022 iy) 


5 1 23 / 1 1379 j, 2/ ^\ 4 1 2329 „3f^ 6 
192 "1" 144 ^ U I "•" 4320 ^ U / "•" 11340 " \ 4 / 


9122 iy) 


29 fh\ 2 | 2743 (h^ , 2329 fh\ e ,,2 
720 U/ " r 15120 \ 4> ^ 1 11340 ^ 


9023 W 


43 (h\ 2 , 691 (fc\ 4 ,. 1 2329 fM 6 ..2 
720 U / " r 3024 U / " " r 11340 U ) u 


9123 l^J 


2743 (h\ A | 2329 //i\ 6 ,, 
30240 \4> 1 11340 ^ 


5033 (z 7 ) 


463 (h\ A , 2329 //i^ 6 
3360 " r 11340 W 


9133 l^J 


2329 /fe\ 6 
11340 V4/ 


C/222 I,' 7 J 


ill,, (h\ 2 , _67_ , 47 ,,3C^^ 
64 + 48 ^ U J + 288 ^ U -> + 270 ^ U J 


9223 M 


.A. (iy 2 + JL (■ k\ v + JL (i) V 

144 V 4 / ~ 432 v 4 / ^ 270 v 4 / ^ 


9233 H 


JL f^) 4 _|_ JL (hL^y 
864 V 4 / ~ 270 V 4 / ^ 


9333 0) 


JL 

270 U/ 



Table 1 

Coefficients (l22l) of the fourth-order compact approximation (I23ab . Only 20 coefficients 
are given out of o total of 64, because gijk are symmetric with respect to the permutations 
of indices, i.e. # io = 9ooi, 9310 = 9oi3, etc. 



obtain the following fourth order scheme: 

def E m+ i — E n 



hF m (E) 



1 



v, 



h 2 kl 



Em — E, 



m— 1 
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1 + V. 



m—1" 



h 2 ^ 

24 
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h 2 k^ l-Em+il E m+ i — \E m \ E m 
+ €m ~A h 

h kn \E m \ E m L£/ m _i E m _i 
-e^i-U 1 (23a) 

3 3 

+ hk\v m _ x fi(v m -i)v~ + hkle m -i ^ g ijk {v m -i){Vi)*vJv^ 

i=0 i,j,k=0 
3 3 

+ hk\v m ^ fi{vm)vf + hk\e m ^ gijk(vm)(v?)' *Vj ~v£ = 0, 

i=0 i,j,k=0 

where m = 1, . . . , M. As in the case of second order approximations, the value of 
the field on the ghost nodes m . = and m = M + 1 will be determined from the 
boundary conditions, see Section [231 

Similarly to the second order cases (Sections |2.2| and |231) , if v and e are continuous 
at a given node z m , then E is smooth at this location and the scheme (|23ab reduces 
to the following fourth order scheme for the differential equation (l9a|) : 



r-, /,-,>. E m+ i — 2E m + E m _i I h 2 k 2 
r m [hj) = — 1 H — ——v„ 



24 



2 2 2 

H — (J-Em+il ^m+i ~~ 2|£ , m | i? m + E m -i 

3 (23b) 

i=0 
3 

i,j,k=0 

Note that in the simplest case of a linear equation with constant coefficients, e m = 
and u m = u = const, scheme (I23bl) transforms into 

E m -\ — 2E rn + E m+ i 2 E m —i + 4£/ m + E m+ i 

* V 6 (24) 

t.2i 4 2'Em-l + + 7E m+ i 

+ h knv = 0. 

384 

It can be verified that the scheme (1241) is equivalent (up to terms of order 0(h A ) and 
higher) to the standard three-point fourth order compact approximation 

E m -\ — 2E m + E m+ i 2 E m _i + 10i? m + E m+ i 

h* + KV 12 = °- (25) 

of the linear constant coefficient Helmholtz equation [16]. 
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2.5 Two-way boundary conditions 

We now derive the discrete version of the two-way boundary conditions (|9bl) at the 
interface z = and z = Z m£kX . Recall that the two-way boundary condition (1751 ) was 
constructed in Section Q] so as to facilitate the propagation of the outgoing waves 
through the interface z = and at the same time to prescribe the given incoming 
signal. This means that the solution to equation (l9~al) for z < is to be composed of 
a given incoming wave and the outgoing wave, which is not known ahead of time. 
Since for z < the material is a homogeneous linear dielectric with v = 1 and 
e = 0, we have: 

E{z) = E? nc e ikoz + Re~ ikoz , z < 0, (26) 
and the boundary condition is derived from the continuity of E and E' at z = 0. 

Our approach to constructing the discrete boundary condition for the scheme is 
to approximate (1261 ) using closed form solutions of the corresponding difference 
equation. This will provide the value of the solution at the ghost node E in terms 
of that at the boundary node E 1 and the incoming beam E® nc . Then, E can be elim- 
inated from the equation i*\[E] = 0. A survey of methods for setting the boundary 
conditions at external artificial boundaries can be found in [23]. In the context of 
the one-dimensional NLH, the continuous two-way boundary conditions are dis- 
cussed in [4]. For the multidimensional NLH, the continuous and discrete two-way 
boundary conditions are constructed and implemented in [11-13]. 

Since v m = 1 and e m = for m = 0, —1, . . . (i.e., for z < 0), both the second 
order approximation and the fourth order approximation of Section [2] reduce to a 
symmetric constant-coefficient three-point discretization of the form: 

= E m (V)=L 1 E m _ 1 -2L E m + L 1 E m+1 , m = 0, -1, . . . , (27) 

where the coefficients L and L\ are different for each specific approximation. For 
the second order discretizations (117a| ) and (1 1 8ab we have: 

L = hr 2 — , Li = hr 2 + -, 
o 8 8 

while for the fourth order discretization (123a|) we have: 

L = h~ 2 - - - —h 2 , L 1 = h- 2 + - + —h 2 . 

3 128 6 384 

The general solution of the difference equation (1271) is C\q m + C2q~ m , where 

q = Lq/Lx + i^l-iLv/Lxf and q~ x = L /Lx - i^l-{L Q /Lxf (28) 

are roots of the characteristic equation L x q 2 — 2L q + Lx = 0. As can be easily 
seen from (1281) . \q\ = 1 and q^ 1 = q*. Moreover, one can show that the solution 
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q m approximates the right-traveling wave e lkoz = e lfe °' l ( m_1 ), and the solution q~ m 
approximates the left-traveling wave e~ lk ° z = e - lk oh{m-\) ^ w ^ respective orders 
of accuracy (second or fourth), see [1 1-13] for more detail. 

Consequently, the discrete counterpart of formula (1261) for m < 1 can be written as 

E m = E? nc q m ~ l + Rq 1 '™, m = 1, 0, -1, . . . . (29) 

From equation (1291) considered for m = 1 and for m . = we can express the value 
of the solution at the ghost node E as 

E = (q- 1 -q)E? nc + qE 1 . (30a) 

The discrete version of the two-way boundary condition (l9bl) at z = is then 
obtained by substituting E from (I30al) into the discrete equation Fi[E] = 0, i.e., 
into equation (I17al) . (I18al) or (123 al) with m = 1. 

Similarly, the discrete version of the Sommerfeld boundary condition (l9b~l) at z = 

£?m+i = ?£m- (30b) 
This relation is substituted into the discrete equation Fa/[E] = 0. 



3 Newton's iterations 



The discrete approximations (I17al) . (I18al) and (I23al) are coupled systems of nonlin- 
ear algebraic equations. In our previous work [11-13], we solved similar systems 
by an iteration scheme based on freezing the nonlinearity \E\ 2 in equation (OQ). In 
doing so, we have observed that the convergence of iterations was limited to rela- 
tively low-power incoming beams, i.e., weak nonlinearities. 

In this section, we describe a different iteration scheme for solving the NLH (|9al ) 
based on Newton's method. Newton's method cannot be applied to equation (|9al ) 
directly, because \E\is not differentiable in the Cauchy-Riemann sense and hence 
the entire operator is not differentiable in the sense of Frechet. This difficulty and 
the way to overcome it are first discussed in Section [3TT1 through the consideration 
of Newton's method for a single-variable complex function. The method is then 
extended to multivariable functions in Section 13.21 and its application to the dis- 
cretizations of Section[2]is considered in Section l3~3l Note that the particular imple- 
mentation of Newton's method presented hereafter leads to a convenient block tridi- 
agonal structure of the Jacobians that enables efficient numerical inversion (O(M) 
time). 
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3.1 A single complex variable 



The basic idea is, in fact, quite simple — while the function \E\ 2 is not differen- 
tiable with respect to E, it is differentiable with respect to Re(E) and lm(E) as 
a function of two real variables. Hence, Newton's linearization can be obtained if 
one complex equation is recast as a system of two real equations. Let us first recall 
Newton's method for solving the scalar equation 

= F(E), 

where F is differentiable with respect to E. We denote the exact solution by E, 
the j-th iterate by E^\ and their difference by 5E = E — E^>. Using the Taylor 
expansion around E^> we have: 

rIF 

= F(E) = F{E {j) + SE) = F{E {j) ) + ^— 

dE 

Introducing the differential of F at E^\ 



SE + 0[\SE\ 2 ) . 

E=EU) 



dF 

5F = J(E U) )5E, where J(E) = — , (31) 

dE 

we can then write: 

5F = F(E® + 5E) - F(E®) + O {\5E\ 2 ) = -F(E®) + O {\5E\ 2 ) , 
and consequently, 

J(E®)6E = 5F = -F(E®) + O (\5E\ 2 ) . (32) 

Neglecting the O (\5E\ 2 ) term in (|32l) and solving the equation with respect to SE 
we obtain the next iterate E^ +V) = E^ + 5E: 

E U+i) = E U) _ ^J(E^)] _1 F(E^). (33) 

If the initial guess E^ is chosen sufficiently close to E, then the sequence of 
Newton's iterations (|33l is known to converge to the exact solution E as j ; — > oo. 

Next, we consider the scalar equation 

= F(E) = \E\ 2 E - 1 = E*E 2 - 1. (34) 



The modulus \E\ in (1341) is not differentiable with respect to E in the Cauchy- 
Riemann sense. However, F is differentiable as a function of two variables Re(E) 
and lm(E) or, alternatively, E and E*. Hence, 

= F(E) = F(E U) + SE) = F(E (J) ) + (E U) ) 2 SE* + 2\E {j) \ 2 5E + O (\SE\ 2 ) . 
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Therefore, since |^ = 2\E\ 2 and = E 2 , the analogue of (13TT) is 



<5F = J\5E + J 2 5£*, where J\{E) = || , J 2 (£) = J^. (35) 



Consequently, the equivalent of (1321) is 

J!(EW)6E + J 2 {E {j) )5E* = 5E = -F{E {j) ) + O (|<5£| 2 ) . (36) 



To solve equation (1361 ) for and obtain the equivalent of (|33l) , we separate the 
real and imaginary parts of the function F and the independent variable E. This is 
convenient to do by representing them as real 2x1 column vectors: 



E^ E 



Re(£) 
lm{E) 



F^ F 



Re(F) 
Im(F) 



Then, multiplication by a complex number and conjugation correspond to matrix 
operations on IR 2 , which leads to a real Jacobian in ( I33T ). Indeed, multiplication by 
a complex number c can be represented as 



C'ZH c ■ z 



Re(c • z) 




Im(c • z) 





Re(c) -Im(c) 
Im(c) Refc) 



Re(z) 
Im(z) 



If we associate a 2 x 2 real matrix c with a given complex number c: 



Re (c) -Im (c) 
Imfc) Refc) 



Re(c) 



1 
1 



Im(c) 



-1 

1 



then 



c • Z I — ► c • z. 



Similarly, complex conjugation is a left multiplication by the matrix diag[l, — 1]: 



z i — ^ z 



1 




Re(» 




1 


-1 




Im(z) 




-1 
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Thus, equation (1351) transforms into 



5F 



2\E^\ 2 





2\E^\ 2 


Ji 






1 








) 




-1 





Ro{SE) 
1m(SE) 



Re(E^) 2 -lm(E^) 2 
Jm(E^) 2 Re(E^) 2 



1 
-1 



J 2 



5E d ^ ?5E, 



Re(SE) 
Im(5E) 

(37) 



where 







1 




J = 


Ji + X 




) 




{ 


-1 





Having derived the real Jacobian J , we neglect the quadratic terms in (1361 ) to obtain 
the following Newton's iteration: 



J;(j+!) _ J;G0 = _ 



1{E^) 



F(E {j) ) 



3.2 Extension to multiple variables 



We now apply the procedure outlined in Section |3~TTl to a system of the form F(E) = 
0, where E = [E±, E M f G C M and F = [F±, F M f e C M . We would like 
to solve the equations using Newton's iterations of the type: 

E (i+i) _ E (i) = _ [j(E (i) )]~ 1 F(E ( - ? ' ) ), 

where J(E) is the appropriate Jacobian of F(E). As, however, the individual com- 
ponents of the vector F are not differentiable in the Cauchy-Riemann sense with 
respect to the components of E, the Frechet differential of F(E) and the corre- 
sponding Jacobian can only be introduced as in Section [3TT1 by recasting the equa- 
tion using the real and imaginary parts of all variables. 

As in Section [3TTT the variation of F(E) in terms of the field E and its conjugate E* 
is given by 

dF dF 
5F(E) = Ji<5E + J 2 5E*, where J x = — and J 2 



dE dE* 
Let us represent E and F as 2M x 1 column vectors with real components: 



23 



E 
F 



Re(^i), ha(Ei), . . . Re(£ m ), Im(E m ), . . . Rc(E M ), lm(E 



M 



Re(Fi), Im(^i). • • • Re(F m ), Im(F m ), . . . Re(F M ), Im(F, 



T 



To obtain the real Jacobian J, we will represent the complex matrices J\ and J 2 as 
real matrices of dimension 2M x 2M. Let A be a complex M x M matrix. For 

each entry A; m , we substitute the 2x2 real block A\ m : 



Ah A 



A 



ii 



.4 



1M 



A 



Ml 



A 



MM 



Re(An) -Im(A n ) 
Im(An) Re(An) 

Re(A M1 ) -Im(A M1 ; 
Im(A M1 ) Re(A M1 ) 



Re(Ai M ) -Im(A 1M ) 
Im(AiM) Re(A 1M ) 

Re(A MM ) -Im(A MM ) 
Im(A MM ) Re(A MM ) 



Introducing the matrix direct product A <g> 5 as the matrix obtained by replacing 
each entry A[ m of A by the block A irn ■ B, we can write: 



A = Re (A) 



1 
1 



ha(A) 



-1 

1 



Similarly, the conjugation of a column vector E can be represented as 

1 
-1 



E* 





/ 


1 






1 


E* = 


hi® 


-1 


> 




-1 



1 
-1 



2MX2M 



where I M is the M x M identity matrix. 
Then, the differential of the real function F is: 

5F(E) = Ji5E + T26W = 75% 
where the real Jacobian is given by the 2M x 2M matrix 





( 


1 




J = Jl + J2 ■ 


Im® 




) 




\ 


1 
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3.3 Differentiation o/F(E) with respect to E and E* 



In this section, we discuss the actual differentiation of F(E), i.e., the evaluation of 
J\ — |g and J 2 = Jpr. As we shall see, the tensor notation of Sections |23l and |2~4l 
prove extremely useful in this context. 

Using the identities 

dE* dEi dEi dE* . f 0, % ^ k 



8E k dE* k dE k 3E* k [1, i = k 

we first differentiate the linear terms in F(E) and obtain for q — — 1, 0, 1: 

d 

\L\E m -x — 2L E m + LxE m+ i) = 0, 



9E^ +q 



d 

— {LiE m _i — 2L E m + LiE m+ i) = Li6-i ;Q — 2Lq5q s + Li8\^ 



dE m+q 



where the notation L , L\ for the coefficients of the scheme was introduced in 
Section l231 The nonlinear terms of the second order scheme (I17al) are differentiated 

as 

d d 

I rp 1 2 rp O I 7? 1^ I J7 1 1^ J7 1 TT^ 

\ J - J m\ J - J m ^ |- C/ m| ; r, r \ J -'m\ - C/ m' 



dE m 1 1 ' dE\„ 

and similarly for \E m ±i\ 2 E m ±\. 

For the alternative second order scheme (II 8ab we have: 

d 



^ ] QijkE m ^E m j t .jE rn j rk ^ ] Qijk^iqEm+j E m j rk ^ ] QqjkE m +jE m -\- k 



17 m+g jj,fc=0 ij',fc=0 j,k=0 

and, using the symmetry of g^ k : 

9 1 1 1 

^ X! 9ijkE* m+i E m +jE m ^ k = 2J 9iqkE* l+i E m+k + ^ g ijq E^ +i E m+j 

UCj m+q i : j tk =o i :k= o jJ=o 

1 

= 2 J]] 9ijqEm+iErn+j ■ 
i,j=0 



Similarly, the nonlinear terms of the fourth order scheme (I23al) are differentiated as 

d 3 3 <% + 9 3 3 dvt 



UJ-im+q i=0 j =0 U^m+q V^m+q j=o i=0 ^-^m+g 
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and 







dE. 



m+q ij,k=0 

d 



i,j,k=0 
3 



m+q 



dv + 

3 dE, 



urj m+q ij,fc=0 



E 9ijk-{v?)*vfv% 



E k + «j dEl 



i,j,k=0 



m+q 



m+q t 
J 3Em+q, 



3.4 Choice of the initial guess 



Convergence of Newton's iterations is known to be sensitive to how close the ini- 
tial guess E(°) happens to be to the solution E. Hereafter, we use two different 
strategies for choosing the initial guesses. 

When we test the convergence of Newton's iterations in the vicinity of the exact 
solution (of the discrete system of equations), we use for the initial guess the closed 
form solution of the continuous problem © of Chen and Mills [4,5]. Indeed, we 
can expect this solution to be either O (h 2 ) or O (/i 4 ) close to the exact solution of 
the discrete system of equations F(E) = 0. This expectation, which is based on 
the accuracy analysis of Sections [2721 12.3L and 12.41 and does not involve a stability 
proof, is later corroborated experimentally (see Section [5]). 

A more interesting case, of course, is when the solution is not known ahead of 
time0 In order to choose the initial guess in this case, we adopt a continuation 
approach in the nonlinearity parameter e. Namely, we increase e in a series of in- 
crements: 

ei < e 2 < . . . < e n , 

where at the j-th stage we apply Newton's method to the NLH (|9al ) with e = ej 
using the solution from the j-1 stage with ej_i as the initial guess. In doing so, the 
value of e„ is the target nonlinearity parameter for a given computation, and it can 
be large. At the beginning stage j = 0, the initial guess may be chosen either as the 
solution of the linear problem with e = 0, or as the solution obtained by iteration 
schemes based on freezing the nonlinearity as in [11-15], which converge for weak 
nonlinearities. 



4 Summary of the numerical method 



An integral formulation of the NLH (|9aT ) is discretized on the grid (II 1 ab and written 
in the form F(E) = 0. The operator F(E) is given by (|17a|) , (|18aD or (|23ab at the 
interior nodes m = 1, . . . , M, while the ghost nodes E and Em+i are specified 



This would be the case, in particular, for the multidimensional NLH. 
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by (130al) and (I30bl) . respectively. The resulting system of nonlinear equations is 
linearized: 



SF(E) = Jt5E + J 2 5E*, 



Ji 



dF 

dE'' 



Jo 



dF 

dE* 



where J\ and J 2 are calculated in Section 13 .31 An equivalent linearized form is 
obtained using the M. 2M representation of Section I3T21 









1 




5F = 7SE, 


7 = T 1 + T 2 - 




-1 


) 



Subsequently, the real Jacobian J is used to build the sequence of Newton's itera- 
tions: 



E^' +1 ) - E (i) 



F(E (J) ) 



As we shall see, it is sometimes useful to use a relaxation scheme: 



E 0'+i) _ gtf) = - 



10 



j(E^) F(E (J) ). 



(38) 



where the relaxation parameter is typically chosen in the range 0.1 < uj < 0.3. The 
value oj — \ reduces (|38l ) back to the original Newton's method. 

The initial guess is taken as the closed form continuous solution when the 
convergence of Newton's iterations is first studied. In general, continuation by the 
nonlinearity parameter e is used to compute the solutions for strong nonlinearities. 

The last important component of the overall numerical method is the inversion of 
the Jacobian. As the problem we are currently solving is one-dimensional, we are 



using a direct sparse solver to evaluate 



i -i 



for every j = 0, 1, 2, . . .. For 



J(E^) 

all our schemes, both second order and fourth order accurate, the Jacobian has three 
non-zero diagonals composed of 2 x 2 blocks, which enables an efficient solution 
by a sparse method. 



5 Numerical results 



In this section, we experimentally assess the computational error and the conver- 
gence of iterations for the new method, by comparing it with other methods that we 
have used previously to solve the NLH [11-13]. Since in the one-dimensional case 
the closed form solutions are available explicitly, the numerical error is evaluated 
directly. All computations in this section are conducted with double precision. 
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5. 1 Reference methods 



In Section [5.2. 11 we compare convergence of Newton's iterations with that of the 
iterations based on freezing the nonlinearity: 

, „ + u(z)E^ + e(z)\E^\ 2 E^ = 0. (39) 
dz 2 

For every j = 0, 1, 2, . . ., the next iterate E^ +1 > is obtained by solving the linear, 
variable coefficients, differential equation (|39l) . This "freezing" approach was used 
in our earlier work [11-13] and also by Suryanto et al. [14, 15]. In addition, we use 
a relaxation method based on (|39|): 



E (j+i) = ( X _ ^ E U) + ujE^ 2 \ (40) 
where E^ +l/2) in (gDJ) is the solution of (|3~9"1) . Note that is an analogue of (1381) . 

In Section 15.3.11 we evaluate the error of the compact schemes built in Section [2] 
We compare it with the error of the standard central-difference discretizations of 
order two: 

Em—1 ~ 2-EVrt + E m +i iil I j-i |2\ 771 n i ai\ 
^2 ^ ^Ol^m + e m|^m| )E m = (41) 

and of order four: 

— E m -2 + 16.E m _l — 30-Em + lG-Em+i — -E m +2 ,2/ I 771 |2\ 771 _ n 
jql2 ^ "'0V I/ m + e m|-C'm| j-C'm — 

(42) 

In all the simulations, we made sure that the iterations' convergence was sufficient 
to enable a robust evaluation of the discretization error, i.e., to distinguish between 
the error of the difference scheme and the error due to the possible "underconver- 
gence" of our iterations. Furthermore, we verified that the new iterative method 
(Newton's) and the freezing iterative method (l39l) , when they both converge, pro- 
vide the same error (Section T5 .3 .11) . 



5.2 Convergence 



In this section, we discuss convergence of Newton's method and compare it with 
that of the nonlinear iterations (|39l) and (|40l ). The parameters used are v = 1, 
k = 8, Z max = 10, and a uniform nonlinearity profile e(z) = const. Note that 
v = 1 corresponds to the case of the linear index of refraction being the same 
inside and outside the Kerr medium. For these parameters, the first nonuniqueness 
region occurs around e = e c w 0.72 (see Figure [2]) • 
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5.2.1 Local convergence 



The goal of the first series of compu- 
tations is to determine how the magni- 
tude of nonlinearity (i.e., the value of e) 
affects the convergence of Newton's it- 
erations relative to that of nonlinear it- 
erations (l39l) . To achieve this goal, we 
choose the initial guess to be the point- 
wise values of the continuous exact solu- 
tion on the grid, which, as noted, is close 
to the actual discrete solution for fine 
grids. To distinguish between the issues 
related to iterations' convergence and 
those pertinent to a specific discretiza- 
tion, we choose one particular scheme, 
the simplest second order scheme (|4T|) . 
for all the convergence experiments in 
this subsection. 

The nonlinear iteration scheme (l39~l) con- 
verges for e < 0.167 ~ 0.23e c . Its relax- 
ation analogue (l40l) with u — 0.1 allows 
us to increase the convergence range up 
to about 0.3 ~ 0.4e c . Decreasing the value of u does not seem to have a significant 
effect. For e above these thresholds the iterations diverge, and the divergence occurs 
also for other discretizations that we have used (Section[2]) rather than only for (|4T|) . 
Therefore, it shall be interpreted as a limitation of the iteration procedure itself. This 
divergence is not related to the onset of nonuniqueness in the NLH, because the 
convergence breaks down far below the nonlinearity threshold for uniqueness e c . 

In contradistinction to that, Newton's method convergence for e £ [0,0.9], ex- 
cept near the switchback points e = e c and e — e' c , where f^J = (see Fig- 
ure [3]). As expected, the convergence of Newton's iterations considerably slows 
down as |e — e c \ or |e — e' c \ becomes small, and eventually, close enough to a given 
switchback point, the iterations diverge (the Jacobians degenerate at the switchback 
points). Other than that, the method shows robust convergence. Specifically, the 
method converges when the solutions are non-unique: When the initial guess was 
close to one of the points B, C, or D in Figure [31 which correspond to e = 0.724 
inside the first region of nonuniqueness, the method converged to the respective 
discrete solution. This indicates that if the grid is sufficiently fine to tell between 
two close solutions inside the switchback (see Section l5.3.1l) . Newton's method has 
an adequate domain of convergence. Similar results were obtained for e = 0.834, 
which is in the middle of the second switchback region 0.828 < e < 0.839. 
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0.723 E c 0.724 0.725 



Figure 3. Local convergence experi- 
ments for Newton's iterations performed 
in the region of the first switchback 
of the one-dimensional NLH (|9ab (see 
Figure 13. Each of the initial guesses 
near A-E converged to the correspond- 
ing discrete solution. In addition, the 
continuation approach with A as the ini- 
tial guess and e = 0.724 converged to B, 
while continuation with E as the initial 
guess and e = 0.724 converged to D. 



Even at the highest nonlinearity we have tried, e = 3 ~ 4e c , Newton's iterations still 
converge. The value e = 3 is well beyond the first switchback. Moreover, it is an 
extremely high nonlinearity in two respects: First, for e = 3 there are 7 distinct, i.e., 
nonunique solutions (we tested the convergence to the highest power solution). Sec- 
ond, at this value the nonlinear response is so large that it would cause a breakdown 
and ionization of the actual physical material in an experimental setting, which ren- 
ders the original Kerr model inapplicable. We note that we did not observe in our 
simulations any convergence deterioration of Newton's method around e = 3 com- 
pared with smaller e's, it only requires 3 to 6 iterations to drive the residual down by 
10 orders of magnitude. We thus assume that most likely Newton's method would 
have converged for much higher values of e as well, although it has not been tried 
because of the physical irrelevance. 

As normally expected from Newton's method, the iterations converge rapidly, at 
a quadratic rate. In most of the cases that we studied in this section, it took 4-6 
iterations to reduce the original residual by 9 to 11 orders of magnitude. As has 
been mentioned, the only situation when Newton's convergence may noticeably 
slow down is for e near the switchback points e c and e' c , where the tangent to the 
curve T(e) becomes vertical, see Figured Convergence of the iteration scheme ( l39l 
which is based on freezing the nonlinearity, is much slower. It is estimated as linear 
based on experimental evidence, and on the fact that it can be interpreted as a fixed- 
point iteration. 

5.2.2 Continuation approach 

Having seen that Newton's method is locally convergent, we would like to test 
its performance for initial guesses that are not necessarily close to the solution. 
Our first observation is that when the solution to the linear problem is used as the 
initial guess, i.e., E^ = e lk ° z , then the iterations converge for < e < 0.08, and 
diverge for e > 0.08. Therefore, for larger values of e, we employ a continuation 
heuristics, increasing the nonlinearity in a series of increments (see Section [3~4l) . 
We emphasize that this version of the method uses no a priori knowledge of the 
solutions sought for. For the experiments in this section, we use the compact fourth 
order scheme (|23al) . 

In order to quantify the performance of the continuation heuristic, we tested, 
for initial values of e ; in [0,0.9], the ranges of allowable positive increments 
de = €{ — €i for which Newton's method would still converge[ll In other words, 
for each pair (e ; , de > 0) we applied Newton's method for the NLH with non- 
linearity 6f = €i + de, and with the initial guess E^ given by the solution with 
e = e ; . The results are displayed in Figures |4] and [51 Several observations can be 

8 We are primarily interested in the positive increments because our key objective for em- 
ploying the continuation strategy is to obtain suitable initial guesses for Newton's method 
when it is applied to high energy cases. 
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made. First, at higher nonlinearities the allowable increments are generally smaller, 
see Figure St A). Second, as can be seen by observing the adjacent transmittance 
graph in Figure[5£A), the allowable increments are highly correlated with the trans- 
mittance, which can be viewed as an indicator for the distance between the NLH 
solutions. Specifically, at the first and second switchback regions, see Figure OB), 
the allowable values of de demonstrate a rather irregular behavior. It is most im- 
portant however, that the allowable values of de do not decrease to zero, so that 
the continuation strategy for Newton's method can traverse through the first and 
second switchback regions. 




0.9 "0 




Figure 4. A) The allowable positive increments de = ef — e; for which the continuation 
method works, i.e., for which Newton's method converges; v = 1, ko = 8 and Z max = 10. 
The iterations were defined as converged if the residual decreased by a factor of 10 6 in 
20 iterations. B) Same as (A), for Newton's method with relaxation d38l ), with u = 0.3. 
The iterations were defined as converged if the residual decreased by a factor of 10 6 in 60 
iterations. 
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Figure 5. A) Same as (@K), plotted together with the transmittance T(e) (see Figure|2]). B) 
Same as (A), zooming on the first switchback region 



The previous tests were rerun with the relaxation version (1381) of Newton's method. 
The results are presented in Figure 0JB). One can see that the relaxation consider- 
ably increases the allowable increments, especially at the switchback regions. 

We next study the performance of Newton's method inside the nonuniqueness re- 
gion. When the initial guess was the solution slightly before the first switchback 
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on the T(e) curve, at point A in Figure [H and the value of e was chosen within 
the nonuniqueness region: e = 0.724, the method converged to the lower branch of 
the switchback, i.e., to the solution denoted by B in Figure [3j Similarly, selecting 
the initial guess slightly past the switchback, at point E in Figure [31 and again tak- 
ing e = 0.724 (negative increments are not shown in Figures [4] and [5]), facilitated 
convergence to the higher branch of the switchback curve, i.e., to the solution D in 
Figure [31 This behavior agrees with the standard notion of a hysteresis loop for a 
bistable device. 

It is also important to note that the continuation strategy for Newton's method can 
"hop over" (at least) the first and second switchback regions, which is an efficient 
way of reaching the regions of high nonlinearity. For example, a transition across 
the first switchback, from point A to point E in Figure [31 is possible by choosing 
the solution at point A as the initial guess for computing the solution with the value 
of e that corresponds to E. In this context we should mention that the first two 
nonuniqueness regions are rather narrow. For wider nonuniqueness regions that 
correspond to larger values of e, using a combination of the continuation in e and 
relaxation can be beneficial. Indeed, even though more iterations will be required 
for convergence with relaxation, larger allowable increments de — 6f — e x will help 
traverse those wider regions of nonuniqueness, see Figure[4tB). 

5.3 Computational error 

5.3.1 Homogeneous medium (discontinuities at the boundaries) 

In this section, we consider the case of a homogeneous Kerr medium (see formula 
©). Hence, the discontinuities are only at z = and z = Z max . The error of the 
solutions computed with the new schemes (|17a| ), (1 1 8ab . and (|23ak as well as the 
reference schemes (HTI) and (1421) . is reported in Tabled All computations are done 
using Newton's solver. The error is defined as the difference between the computed 
solution and the closed form Chen and Mills solution [4], and is evaluated in the 
maximum (Z^) norm. 

The discrete approximations of Section [2] are designed to retain their order of ac- 
curacy in the presence of material discontinuities. In order to test this, for each 
scheme we consider two cases, see Tabled The first case, v = 1.01 2 , e = 0.01, 
corresponds to a small discontinuity at the boundary and a weak nonlinearity. Note 
that the quantities ^/v — 1 = 0.01 and e characterize the difference between the 
linear and nonlinear indices of refraction inside and outside the medium, see Sec- 
tion ll.ll The second case, v = 1.3 2 , e = 0.845, corresponds to a large discontinuity 
at the boundary and an 0(1) nonlinearity (e/u = 0.5). 

In the first case, v = 1.01 2 , e = 0.01, the computations can also be repeated us- 
ing the original iterative solver (l39l) . because for this choice of parameters it still 
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Table 2 

Error for the 3 schemes of Section |2] and 2 reference schemes of Section 15.11 
Z max = 10 , ko = 8 . The entries are empty for cases wherein Newton's iteration diverged. 



converges. Having done that, we determined that the accuracy of the correspond- 
ing solution was the same as the accuracy of the solution obtained using Newton's 
method. This indicates that the errors presented in Tableware indeed the approxi- 
mation errors of the discrete schemes and should not be attributed to the solver. For 
the case with higher nonlinearity, e = 0.845, only Newton's iterations converged. 

The functional dependence of the error on the dimensionless grid resolution 
h = k h = fc °^y ax is shown in the rightmost column of Table [2] It is obtained by 
a weighted least squares fit. Considering the reference methods of Section |5~T1 the 
three-point central-difference approximation (14TT) displays a second order conver- 
gence. The five-point central-difference approximation (1421) . however, displays a 
fourth order convergence for (relatively) low grid resolutions and small discontinu- 
ities. For high grid resolutions and large discontinuities, however, its accuracy dete- 
riorates and shows a second order convergence. The limited ability of the reference 
methods to handle discontinuities is also reflected by the fact that the actual errors, 
and hence the coefficients in front of h 2 and h A increase substantially for larger 
discontinuities. As mentioned in Section [TT1 the five-node discretization (l42j) is 
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particularly sensitive to the presence of discontinuities. 



On the other hand, the errors of the new second order discretizations <|17ab 
and (I18al) . as well as that of the fourth order scheme (I23al) . are hardly affected 
by the increase of the discontinuity. Indeed, for larger discontinuities at the bound- 
ary, the improvement over (I4TT) ranges from a factor of 4 for (I18al) to a factor of 10 
for (I17al) . The improvement of (123 al) over (142)) is even more substantial. 

We also see that (I17al) yields better accuracy (smaller errors) than (I18al) . Indeed, 
intuitively one can expect that the integration of the interpolation of \E\ 2 E will 
approximate / \E\ 2 Edz better than the integration of the interpolation of E cubed. 
However, as we do not have d ^ E J z E ^ or d ^ E \ the discretization (I17al) cannot be 
extended to fourth order accuracy. There, perhaps, could be other approaches, such 
as the interpolation of the amplitude and phase of E. They, however, do not provide 
an obvious venue to the fourth order either. 

Regarding the new fourth order discretization (|23at , we can see from Table[2]that it 
is indeed fourth order accurate for both small and large material discontinuities. A 
minor increase of the error for larger e can be observed, which is natural to expect 
for solutions with sharper variations. Altogether, for the cases reported in Table |2] 
scheme (I23al) has proven up to 6000 times more accurate than the standard five- 
node central-difference scheme (1421) . 



A v=1.01 2 , 5-ptFD 

° v=1.01 2 , newFV 

v v=1.3 2 , 5-ptFD 

n v=1.3 2 , newFV 

1U .2 -1 

10 10 10 

normalized grid size h*k Q 

Figure 6. Computational error as a function of the grid size for schemes (I23ab [labeled "new 
FV"] and gg) [labeled "5-pt FD"]. 

To provide a more descriptive and more intuitive account of our grid convergence 
results, we present a log-log plot of the error as it depends on the grid size for 
our fourth order schemes, see Figure [6J The data used for Figure |6] are the same 
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as those in Table [2l Once can clearly see that the accuracy of the original fourth 
order scheme of [1 1, 12] deteriorates on fine meshes, whereas the new scheme (123al) 
maintains its fourth order. 

Finally, we compare the minimum grid resolutions required by different schemes 
(second and fourth order) to distinguish between the solutions inside the region 
of nonuniqueness (points B, C, and D in Figure [3]) and thus enable convergence 
of Newton's iterations. As could be expected, the fourth order scheme used for 
computations of Section |5.2.2| took roughly ten times fewer points per wavelength 
than the second order scheme used for computations of Section |5~.2. II 



5.3.2 Layered medium 

Here, we apply Newton's method along with the fourth order scheme (I23al) to solve 
the NLH for a piecewise-constant material. The configuration is that of a two-layer 
Kerr slab: 

jl.21 26 [0,5) (01210 2 6 [0,5) 

[1.69, ze(5,10] [0.5070, z e (5, 10] 

The material coefficients are therefore discontinuous at z = 5, as well as at the 
boundaries z = and z = 10. The value of the linear wavenumber is k = 8. 

The computed solution is compared with the closed form solutions obtained by 
Chen and Mills in [5]. The results are given in Table [31 they corroborate the de- 
signed fourth order accuracy of the method. 
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Table 3 



Error for the two-layered configuration (|43T ). The finite-volume O (/j 4 ) discretization (I23ab 
was used in combination with Newton's method for ko = 8. 



5.4 Computational efficiency 



Having addressed the issues of convergence and accuracy, we would also like to 
comment on the numerical efficiency of our method. 
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Grid dimension M 


10 2 


1Q 2.5 


10 3 


1Q 3.5 


10 4 


CPU time (sec) 


0.105 


0.333 


1.06 


3.39 


11.1 



Table 4 

Mean CPU times for a single Newton's iteration of the finite volume scheme (I23ab on AMD 
Athlon64 at 2200MHz in Matlab 7.3.0 under Linux. 

The CPU times for one Newton's iteration summarized in Table 0] clearly indicate 
that the complexity scales linearly as a function of the grid dimension. Moreover, 
as the number of Newton's iterations required to obtain the solution typically does 
not depend on the grid dimension (see Section 15.21) . we can say that the overall 
complexity of the proposed method also depends linearly on the grid. Of course, 
it is natural to expect that the methods based on shooting [4,5,7-10] will perform 
faster for a one-dimensional problem than our method that involves a full fledged 
approximation of the boundary value problem for the NLH. The shooting-based 
methods, however, will not generalize to multiple space dimensions. 



6 Discussion and future work 

6. 1 Discussion 

In this study, we approximated the one-dimensional NLH using new compact finite 
volume schemes of orders four and two (the latter predominantly for reference pur- 
poses). The fourth order approximation of nonlinear terms was the most challeng- 
ing part, and it required the use of Birkhoff-Hermite interpolation (see Lemma Q] 
and Appendix 0. For actual implementation, the automation of the computation of 
the coefficients of the scheme was crucial (see Appendix |D1). Note that as we have 
interpolated the field using third degree polynomials, the cubic nonlinearity inside 
every cell was represented by the polynomials of a relatively high degree — degree 
9. We, however, have not seen any adverse implications of that in our simulations. 

Let us also mention that the piecewise interpolating polynomial of the Birkhoff- 
Hermite type0 is not, generally speaking, equivalent to the standard Schoenberg 
cubic spline (see, e.g., [24, Section 2.3.2]). Indeed, the Schoenberg spline takes 
only the nodal values of the interpolated function as input, and is built so that its 
first and second derivatives are continuous at the nodes. In doing so, the equations 
for the coefficients of the spline become coupled across the entire grid. In contradis- 
tinction to that, all individual polynomials of the Birkhoff-Hermite interpolation are 
built independently of one another inside their respective cells. Moreover, the nodal 

9 This is a function on [0, Z max ] that on every grid cell coincides with the corresponding 
cubic polynomial obtained by Lemma [T] 
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values of the second derivative are required as input in addition to the nodal values 
of the function. In doing so, even if the first and second derivatives of the interpo- 
lated function are continuous everywhere, the first derivative of the interpolating 
polynomial may be discontinuous at the nodes. However, the mismatch may not 
exceed 0(h 3 ), because inside every cell the first derivative is approximated with 
third order accuracy. 

Note also that as an alternative to the integral formulation (Section 12.11) and the fi- 
nite volume scheme built uniformly across the entire domain, one could have used 
compact finite differences on the regions of smoothness coupled with the condi- 
tion of continuity of the first derivative at the interfaces. The latter can be built into 
the scheme, say, via one-sided differences. This approach, however, is not equiva- 
lent to ours and may, in our opinion, come short at least along the following two 
lines. First, the uniformity of the approach will be lost — another discontinuity 
introduced in the domain will require special treatment and hence the scheme will 
have to be rebuilt. Second, compactness of the approximation will be compromised 
because of the long one-sided stencils near the discontinuities and as such, the re- 
sulting matrix will have a higher bandwidth. 

The non-reflecting two-way artificial boundary conditions (Section 12.51 ) were de- 
signed similarly to our previous work [11-13]; they are based on the analysis of 
the waves governed by the discrete equation. The boundary conditions prescribe 
the impinging wave that drives the problem and at the same time enable the prop- 
agation of all the outgoing waves. The important difference compared to [11, 12] 
is, however, that for a compact scheme, even fourth order accurate, it is sufficient 
to consider only one ghost node outside the computational domain, whereas for 
the five-point central-difference approximation (l42l) we had to introduce two ghost 
nodes. Indeed, for the linear homogeneous five-point discretization, an additional 
evanescent mode always exists, which needs to be handled with care, see [11, 12]. 
For the compact three point discretization, however, no such mode exists, and the 
construction of the boundary-condition is greatly simplified. In both cases, the ad- 
ditional assumption that we used when calculating the value of the solution at the 
ghost nodes is that outside the domain of interest the field is governed by the linear 
constant coefficient Helmholtz equation. 

The analysis in the paper establishes the formal accuracy of our schemes (i.e., it 
is the analysis of consistency). We do not, however, derive any rigorous error esti- 
mates because the problem is nonlinear. Instead, we study the computational error 
experimentally (see Section [53]). By comparing our numerical solutions with the 
closed form solutions of [4, 5], we have been able to demonstrate that in all the 
cases our schemes possess the design rate of grid convergence. Besides, we pro- 
vide a convergence proof for a linear layered medium (see Appendix [B]). 

Our nonlinear solver for the discretized NLH exploits Newton's iterations. How- 
ever, the nonlinearity in equation < [9ab is nondifferentiable in the sense of Frechet 
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for complex-valued solutions E. Therefore, we present a convenient mechanism for 
transforming the nonlinear systems of equations to the representation in real vari- 
ables, which enables Newton's linearization. The results fully justify this additional 
effort. Indeed, Newton's iterations allow us to solve the NLH for very high nonlin- 
earities, addressing the full range of nonlinearities interesting from the standpoint 
of physics, and beyond, to the level of the actual material breakdown. Moreover, 
even though Newton's method has been applied to problems with Kerr nonlinear- 
ity previously [17], our current implementation is particularly well suited for the 
one-dimensional NLH as it yields block tridiagonal Jacobians. 

We now compare our current work with other studies available in the literature 
on the numerical solution of boundary value problems for the NLH: our previous 
work [11-13], and Suryanto et al. [14, 15]. In terms of discrete approximations, 
these previous studies did not guarantee a fourth order approximation for materi- 
als with discontinuities. We have shown this directly for the discretization of our 
work [11-13] (Section 15.3.11) . while for Suryanto's finite element discretization 
which accounts for discontinuities, the nonlinearity is approximated only to the 
second order. We again emphasize that to the best of our knowledge, the current 
method is the first ever high-order approximation of the NLH with material discon- 
tinuities. The additional improvement is due to the Newton's solver. All iterative 
schemes used previously were based on freezing the nonlinearity [11-15]. As we 
have seen, this freezing approach cannot be used beyond a certain e threshold, un- 
related to the uniqueness of the solutions. We note that the freezing approach was 
used by Suryanto et al. to solve the NLH for the cases when the solution is not 
unique. They report, however, that their setup was that of a highly-grated material 
with a defect, and that often for such setups the threshold for non-uniqueness is 
much lower (in fact, lowering of the threshold was one of the goals in [14, 15]). 
This is in agreement with our own observations: The freezing approach fails at a 
certain nonlinearity threshold unrelated to the solution uniqueness. Therefore, the 
results show that Newton's method, compared with the commonly used freezing 
approach, allows for the much high levels of nonlinearity. Apparently, this is the 
first numerical method for the NLH that works at such high nonlinearities. To sum- 
marize, compared to [11-15], the approach of the current paper enables efficient 
discrete approximation for a problem with material discontinuities and allows so- 
lution for high levels of nonlinearity. We expect that it will provide a basis for the 
future extension to the case of multiple space dimensions (see Section 1631 ) . 

Let us also mention a few additional studies that have something in common but are 
not as close to the current work. In [25], Choi and McKenna analyzed a somewhat 
different equation: Ah + u 3 = 0. They could employ the mountain pass ideas 
because the boundary condition was homogeneous Dirichlet and hence u = was 
a solution. This approach, however, will not apply to the NLH, which is normally 
to be driven by a given incoming wave at the boundary. 

In yet another series of papers, Kriegsmann and Morawetz solve a linear Helmholtz 
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equation with variable coefficients [26] and a focusing NLH [27] in two space 
dimensions, and then Bayliss, Kriegsmann and Morawetz consider a defocusing 
NLH [28]. They employ second order approximations, and the solver is based on 
integration in real time (i.e., using the wave equation) and applying the principle 
of limiting amplitude. The problem they solved is very different though, so at the 
moment we cannot compare their method with ours. 

6.2 Possible future extensions 

The method can be extended to the case of a quintic nonlinearity, a = 2. This will 
involve evaluation of the fifth order tensor coefficients [cf . formula (l22l) 1 : 

gijkim^jFiFjF.F^dC 

This is a straightforward, though tedious extension, for which the automatic gener- 
ation of tensor elements will be a necessity. 

The method can also be extended to the case of piecewise smooth material coef- 
ficients v(z) and e(z), as opposed to only piecewise constant coefficients that we 
have analyzed in the paper. Approximating the quantities u(z) and e(z) by cubic 
polynomials within each grid cell: 

3 3 

v{z m ± (h) = £ cf C j , e(z m ± 00 = £ 4C, 

j=0 k=0 

one can then substitute these approximations into formulae (fT9l ) for one-sided sec- 
ond derivatives, and into the definitions (1221) for the coefficients and g^. 

Likewise, linear and nonlinear absorption can be modeled by allowing the material 
coefficients to become complex. Note, however, that in this case the tensor elements 
gijk will also become complex, 

9ijk = J F*FjF k d(, 
and will lose their symmetry with respect to the indices i, j, k . 

Furthermore, for the linear Helmholtz equation, which corresponds to the case 
e = in this paper, the scheme we have used to approximate (PT21) to fourth or- 
der accuracy can be extended to arbitrarily high orders at virtually no computa- 
tional cost. For example, using the first three even (one-sided) derivatives at z m : 
{E^E'^Etl} and at Z( m+ i)_: {E m+ i, E" m+1 j_, E^l +1 ^_}, one can construct 
the Birkhoff-Hermite quintic polynomial: 

Ph (C; F m , E'^, E m+ , E m+ i, E" m+1 j_, i?( m+1 )_) , 
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such that 

E(z m + (h) = P 5 (() + 0(h e ) , 

and then use it to approximate the integrals in (PT21) . The values of the one-sided 
derivatives are again obtained from the equation: E^ = —k^uE" = k^E , etc. 
Note that this extension cannot be used for the NLH. 

From the standpoint of physics, a very useful extension could be that of considering 
the vectorial NLH, when no assumption of the linear polarization of the field is 
made. Building boundary conditions for this case may require special care. 

On the numerical side, to improve the quality of approximation a nonuniform grid 
can, in principle, be used that would be better suited for resolving sharp variations 
of the solution. In general, however, the structure of the solution is not known ahead 
of time, and therefore, a methodology of this type can only be adaptive. 

Improvements can also be introduced aimed at reducing the CPU time for the 
method proposed in this paper. For example, the summation 9-ijk wa s performed 
in Sections [2~4l and [331 without using the symmetry of the tensor g^, see Tabled] 
Taking it into account could decrease the cost of constructing the Jacobians. We 
believe, however, that it can only benefit the one-dimensional problem because in 
2D the overall cost will most likely be dominated by the inversion of the Jacobian. 

The extension of utmost interest to us, from the standpoint of both theory and ap- 
plications, is to the multidimensional case, see equation ©. It is well known that 
under the paraxial approximation, the NLH reduces to the nonlinear Schrodinger 
equation, which possesses singular solutions. Therefore, the question of global ex- 
istence for the solutions of the NLH in similar configurations is of a substantial 
mathematical and physical interest (see [1 1-13] and the bibliography there for more 
detail). Currently, the only analytical result in this regard is due to Sever [29], who 
proved existence for the NLH with real Robin boundary conditions. However, the 
radiation boundary conditions which model the physical problem do not lead to 
linearized self-adjoint formulations. Hence, the question of global existence in this 
case remains outstanding. 

We re-emphasize that none of the shooting-type methods [4,5,7-10] that are ap- 
parently faster than ours in ID can be generalized to multiple space dimensions. 
Hence, the only viable option in multi-D is to approximate on the grid and solve 
numerically the boundary value problem for the NLH. Construction of a compact 
finite volume discretization in multi-D is possible, although it will not be an auto- 
matic generalization of what has been done in the ID case. The use of Newton's 
method will be of foremost importance, because as we have seen, a simpler itera- 
tion scheme has severe convergence limitations. Hence, the key contribution to the 
overall computational cost in multi-D will be from the inversion of the Jacobians 
— large, sparse, non-Hermitian matrices. The use of direct solvers does not seem 
feasible for those dimensions that will provide a sufficiently fine grid resolution. 
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The only viable alternative is the preconditioned Krylov subspace iterations, and 
as such, finding a good preconditioner will be in the focus of the study. Besides, 
convergence of Newton's iterations slows down if two solutions in the region of 
nonuniqueness are close to one another, such as near the switchback points in Fig- 
ure [3] In the multidimensional case, however, we do not know the structure of the 
NLH solutions ahead of time. Thus, numerical experiments will play a key role for 
fine-tuning the method. 



A Continuity conditions at material interfaces 



For a linearly polarized plane wave that impinges normally on the interface z = 
const, we may assume without loss of generality that the electromagnetic field has 
the form: 

E = [Ei, 0, 0] , H = [0,# 2 ,0]. 

The tangential component of the electric field must be continuous across the inter- 
face, see, e.g., [30,31]. In our case, this implies the continuity of E\ = Ei(z). The 
same is also true for the tangential component of the magnetic field H 2 = H 2 (z), 
which we do not consider explicitly in the current framework. The continuity of 
H 2 , however, allows us to establish another important condition for Ei. The time- 
harmonic form of the Faraday's law (a part of the Maxwell system of equations) 
reads: 

- = curlE, 

c 

and taking into account that E s = we have: 

iufj, dEi 8E 3 dEi 
c 2 dz dx dz 

Then, disregarding all possible magnetization effects, i.e., assuming that the mag- 
netic permeability is equal to 1 (which is certainly legitimate for optical frequen- 
cies), we obtain that the first derivative of the electric field = dE j-*' is also 
continuous across any interface z = const, and hence everywhere. 



B Error estimate in the linear case 



For a linear medium (e = 0) and piecewise constant refraction index u(z), we will 
show that the fourth order scheme (I23al) indeed converges with the design rate of 
0{h 4 ) as h — > 0. Let v (z) be a step function: 

t \ _ f^eft = 1, z < 

V \Z) — \ , Z fc [— Z- max , Z- max J . 

I bright T 1, Z > 
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Let the solution be driven by the impinging wave E^ 
it satisfy the boundary conditions [cf. formulae d9bVl: 



= e lk ° z , and let 



ik 



d_ 

dz 



E 



2iko, 



z=-Z n 



dz 



E 



z=Z n 



where k x = ^/u^ko. 

Then, the continuous solution of the problem is 



E{z) 



e ik Z + Re ~ik Z z < Q 



Te ikiz 



Z>0, 



where the transmission and reflection coefficients are given by 



T 



bright /^left 



and R 



1 — J bright /^left 



bright/ ^left 



(B.l) 



On the uniform grid z m = mh, m = 0, ±1, ±2, . . ., we define: 

z/i c ft, m < 0, 
/right, m > 0. 

The fourth order discretization (|23a|) then reduces to 

Li(v m -i)E m -i — {Lo{v m _ 1 ) + L Q (u m ))E m + Li(u m )E m+1 = 0, (B.2) 

where 

L (u) = h~ 2 --u —u 2 h 2 and LAv) = h~ 2 + -u + — u 2 h 2 , 

0K J 3 128 u ' 6 384 

and /i = /ifc . For m < and, independently, for m > 0, the fundamental set of 
solutions of the difference equation (IB.2I) is {g™, q~ m }, where q v and q^ 1 are roots 
of the characteristic equation Li(v)q — 2L (u) + Li(v)q~ l = 0, and v = ^ lcft or 
f r i g ht, respectively. The roots are given by the following expressions: 



q v = L Q /Li + iy/l - (L /Li) 2 , q v X = q* v - 

The Taylor expansion yields: q u = e l ^ h + 0(h 5 ), which means that q™ approx- 
imates the right-traveling wave e l ^ k oz m = e i^/vk hm^ w ^ e | ts conjugate g~ m ap- 
proximates the left-traveling wave e~ l ^ koZm = e -' l ^ k o hm ? with fourth order accu- 
racy on any finite interval of the independent variable z: 

max lo™ - e ^ fcoz H < const • /i 4 , 
max \q~ m - e ~ l ^ k ° Zm \ < const • h 4 . 
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Similarly to (IB .lb . the discrete solution of the problem is constructed in the form: 



m l^ {h) C g ht, m > 0, 

where gi cft = f q v for u = v\ cit , g rig ht = 1v for 1/ = z/ r i g ht, and the reflection and 
transmission coefficients are obtained from the condition of continuity at m = 0: 

1 + j rW=T w , (B.5a) 

and from the difference equation ( IB .21 ) at m = 0, which reads: 

£l(^lcft)(Cft+# (/l) <?left) 

-(L (z/ lcft ) + L (z/ right ))TW (B.5b) 
+ ^i(fright)T (/l) g r i gh t = 0. 



Solving the system of equations (IB .51) with respect to i?^ and and using the 
Taylor expansion of the resulting solution, one can show that R( h ' = R + O (/i 4 ) 
and T^ h > = T + O (h A ). These relations, along with estimates (IB.3I) . imply that the 
discrete solution E m of (IB .41) converges to the continuous solution E(z) of (IB . 1 1) 
with the rate O (h 4 ) ash — ► 0. 



C Birkhoff-Hermite interpolation (proof of Lemma [B 



A large body of work has been done by different authors on Birkhoff-Hermite in- 
terpolation. Nonetheless, for the completeness of our analysis we present an ele- 
mentary convergence proof in the case of cubic polynomials. It is self-contained 
and does not require any additional facts from the literature. 

It will be convenient to make the change of variables: x = z — z m+ i, so that 



x G 



h h 

' 2 ' 2 



With respect to the new coordinate x, the material discontinuities 

are allowed at x ± |, whereas on the interval (— |, |) and, in particular, at the cell 
center x = 0, the solution is C°°. 



A cubic polynomial P 3 (x) that satisfies P 3 f ±|) = E ±tL and (±f ) = E" h is 




6 -2 V2 /i 









(r 


-J 









(C.l) 
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It is unique since the four parameters E±h, E" h uniquely determine the four coef- 

ficients Cj of /^(x) = Ylj=o c j x ^ vm me solution of the corresponding 4x4 linear 
system. 

Next, we prove that the polynomial P-s(x) is indeed a fourth order approximation 
of the field E(x). Differentiating P 3 (x) of (IC.ll ) three times at x = 0, we have: 



Eh 

2 



E_H 

2 



h 2 E 'L + E"_ 



Eh 

2 



E_h 

2 



El + E" 



P 3 (3) (0) 



24 



T?" TP" 
£Jh — &_h 
2 2_ 

h 



Then, expressing P , h and i?" ^ with the help of the Taylor formulae for E(x) and 

2 ± 2 

E"(x) at x = 0, we obtain: 



h 



P 3 (0) = P(0) + — £"(0) + O /i 4 - - (E"(0) + 0(h 



E(0) + 0[h 
h 2 



E'{0) + —E {3 \0) + O[h 4 



p-m 

= E'(0) + O (V 
P»(0) = E"(0) + O (h 2 
P 3 (3) (0) =P (3) (0) + C>(V). 



^(E^(0) + O(h 2 



(C2) 



Since P^{x) is a cubic polynomial, we can write: 



^Pt\o), 



k=0 ru - 



-X 



Moreover, as E(x) is smooth on ( — ~, ~ ), the Taylor formula yields 



E( x ) = j2^ip-x k + 0(h% xe 1 



k=0 



2 2 



Hence, using equalities (IC.2I) . we obtain: 



P 3 (x) - E(x) = £ ^W-^W ^ + c ^ = Q x e 



k=0 



k\ 



h h s 
2' 2 
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D Software engineering 



Although calculating the 64 coefficients in (1221) is straightforward, it is a te- 
dious and error prone task. As such, it is a natural choice for automation. Note that 
automation will become an absolute necessity should we wish to extend the method 
of this paper, say, to a quintic nonlinearity , or a multidimensional setting. 

In the current paper, we developed simple scripts which automate the calculation 
of the constants g^. The general approach is to use a template file to generate a 
different Maple script for each coefficient. For i, j, k = 0, . . . , 3 , Maple's symbolic 
utilities calculate the function gij k (u,h), while its code generation utilities then 
generate the required Matlab function to evaluate the expression. 

The scripts are available under the GPL license at the following URL: 

|http : / /www .tau.ac. il/~guybar/ 1DNLH| . 
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